Finding Reverse Substrings in DNA
Abstract
Extended abstract: In DNA, a section of a string may be reversed. Because this occurs naturally, it is useful to be able to compare two strings and report where such reversed substrings occur. The K-algorithm provides a way to detect reverse substrings. It has a worst-case time complexity of $O(m^{3/2} n)$, yet it runs quickly in practice because of its parallel nature.
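To make the problem concrete, here is a minimal sketch of the simplest case: two equal-length strings that differ by exactly one reversed section. This is not the K-algorithm described in the abstract, whose details are not given here. The function name `find_reversed_segment` and the single-reversal assumption are illustrative only.

```python
def find_reversed_segment(s, t):
    """Return (start, end) such that reversing s[start:end] yields t.

    Assumes at most one reversed section; returns None if the strings
    are identical, differ in length, or differ by more than a reversal.
    """
    if len(s) != len(t) or s == t:
        return None

    # First position where the two strings disagree.
    lo = 0
    while s[lo] == t[lo]:
        lo += 1

    # Last position where the two strings disagree.
    hi = len(s) - 1
    while s[hi] == t[hi]:
        hi -= 1

    # The mismatching window must read the same backwards in t.
    if s[lo:hi + 1][::-1] == t[lo:hi + 1]:
        return (lo, hi + 1)
    return None


# "CGT" at positions 2..4 is reversed to "TGC" in the second string.
print(find_reversed_segment("AACGTGG", "AATGCGG"))
```

The sketch runs in linear time but handles only a single reversal between aligned strings. Detecting several reversed substrings, or reversals combined with other edits, needs a more general method such as the one the abstract refers to.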
Similar resources
A cost-aggregating integer linear program for motif finding
In the motif finding problem one seeks a set of mutually similar substrings within a collection of biological sequences. This is an important and widely-studied problem, as such shared motifs in DNA often correspond to regulatory elements. We study a combinatorial framework where the goal is to find substrings of a given length such that the sum of their pairwise distances is minimized. We desc...
Efficient Enumeration of Phylogenetically Informative Substrings
We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore conserved) substrings that are shared between all mammals and not found in non-mammals. Such a collection of substrings may be used to identify conserved subsequences or to construct sets of identifying substrings for bra...
Efficient algorithms for the longest common subsequence in $k$-length substrings
Finding the longest common subsequence in k-length substrings (LCSk) is a recently proposed problem motivated by computational biology. This is a generalization of the well-known LCS problem in which matching symbols from two sequences A and B are replaced with matching non-overlapping substrings of length k from A and B. We propose several algorithms for LCSk, being non-trivial incarnations of...
An Efficient Algorithm for Finding Similar Short Substrings from Large Scale String Data
Finding similar substrings/substructures is a central task in analyzing huge amounts of string data such as genome sequences, web documents, log data, etc. In the sense of complexity theory, the existence of polynomial time algorithms for such problems is usually trivial since the number of substrings is bounded by the square of their lengths. However, straightforward algorithms do not work for...
Finding All Tandem Arrays in DNA Sequences
A tandem array is a substring of the form $x^k$, where $x$ is any unspecified substring and $k$ is at least two (when $k$ is 2, $x^2$ is also called a tandem repeat or square). A non-extendable tandem array occurring in string S is a tandem array $x^k$ that is not followed or preceded by another occurrence of $x$ in S. The problem of this thesis is defined as follows: Given a string S of length n, find all...
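The tandem-array definition above can be illustrated with a brute-force sketch. It is not the thesis's algorithm, which the truncated abstract does not describe, and the function name `tandem_arrays` is hypothetical. It reports each non-extendable run as `(start, x, k)` and does not filter out cases where `x` is itself a repetition of a shorter string.

```python
def tandem_arrays(s):
    """List (start, x, k) for every non-extendable tandem array x^k, k >= 2."""
    n = len(s)
    results = []
    for p in range(1, n // 2 + 1):          # p = length of the repeated unit x
        for i in range(n - 2 * p + 1):      # i = candidate start position
            x = s[i:i + p]
            # Skip starts that are preceded by x: the run began earlier.
            if i >= p and s[i - p:i] == x:
                continue
            # Count consecutive copies of x beginning at i.
            k = 1
            while s[i + k * p:i + (k + 1) * p] == x:
                k += 1
            if k >= 2:
                results.append((i, x, k))
    return results


print(tandem_arrays("ABABAB"))
```

The approach is cubic in the worst case and is meant only to pin down the definition.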
Publication date: 2002